RecurSeed and EdgePredictMix: Single-stage Learning is Sufficient for Weakly-Supervised Semantic Segmentation
Although weakly-supervised semantic segmentation using only image-level
labels (WSSS-IL) is potentially useful, its low performance and implementation
complexity still limit its application. The main causes are (a) non-detection
and (b) false-detection phenomena: (a) The class activation maps refined from
existing WSSS-IL methods still only represent partial regions for large-scale
objects, and (b) for small-scale objects, over-activation causes them to
deviate from the object edges. We propose RecurSeed, which alternately reduces
non-detections and false-detections through recursive iterations, thereby implicitly
finding an optimal junction that minimizes both errors. We also propose a novel
data augmentation (DA) approach called EdgePredictMix, which further expresses
an object's edge by utilizing the probability difference information between
adjacent pixels in combining the segmentation results, thereby compensating for
the shortcomings when applying the existing DA methods to WSSS. We achieved new
state-of-the-art performances on both the PASCAL VOC 2012 and MS COCO 2014
benchmarks (VOC val 74.4%, COCO val 46.4%). The code is available at
https://github.com/OFRIN/RecurSeed_and_EdgePredictMix
Scale-invariant Bayesian Neural Networks with Connectivity Tangent Kernel
Explaining generalizations and preventing over-confident predictions are
central goals of studies on the loss landscape of neural networks. Flatness,
defined as loss invariability on perturbations of a pre-trained solution, is
widely accepted as a predictor of generalization in this context. However, it
has been pointed out that flatness and generalization bounds can be changed
arbitrarily by rescaling parameters, and previous studies solved this problem
only partially and with restrictions: counter-intuitively, their
generalization bounds still varied under function-preserving parameter
scaling transformations or were limited to an impractical network structure. As
a more fundamental solution, we propose new prior and posterior distributions
invariant to scaling transformations by decomposing the scale and
connectivity of parameters, thereby allowing the resulting generalization bound
to describe the generalizability of a broad class of networks with the more
practical class of transformations such as weight decay with batch
normalization. We also show that the above issue adversely affects the
uncertainty calibration of Laplace approximation and propose a solution using
our invariant posterior. We empirically demonstrate that our posterior provides
effective flatness and calibration measures with low complexity in such a
practical parameter transformation case, supporting its practical effectiveness
in line with our rationale.
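The function-preserving scaling mentioned in the abstract above can be illustrated with a small NumPy sketch. It is not the authors' method: the two-layer network, the `batch_norm` and `forward` helpers, and the scale factor `alpha` are assumptions chosen only to show that rescaling the weights that feed a batch-normalization layer leaves the outputs unchanged while the parameter norm changes, which is why scale-dependent flatness measures can be altered arbitrarily.

```python
import numpy as np

def batch_norm(z):
    # Normalize each feature over the batch. No learned scale/shift and no
    # epsilon, so the scale invariance holds up to floating-point rounding.
    return (z - z.mean(axis=0, keepdims=True)) / z.std(axis=0, keepdims=True)

def forward(x, w1, w2):
    # Hypothetical two-layer network: linear -> batch norm -> ReLU -> linear.
    h = np.maximum(batch_norm(x @ w1), 0.0)
    return h @ w2

rng = np.random.default_rng(0)
x = rng.normal(size=(32, 8))    # batch of 32 inputs with 8 features
w1 = rng.normal(size=(8, 16))   # weights followed by batch normalization
w2 = rng.normal(size=(16, 3))   # output weights

alpha = 10.0                    # arbitrary positive rescaling of w1
out_original = forward(x, w1, w2)
out_scaled = forward(x, alpha * w1, w2)

# The network function is unchanged, but the norm of w1 grows by alpha.
same_function = np.allclose(out_original, out_scaled)
norm_original = np.linalg.norm(w1)
norm_scaled = np.linalg.norm(alpha * w1)
print(same_function, norm_original, norm_scaled)
```

Any measure that depends on the raw magnitude of `w1` would assign different values to these two parameter settings even though they define the same function.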
Domain Adaptive Transfer Attack (DATA)-based Segmentation Networks for Building Extraction from Aerial Images
Semantic segmentation models based on convolutional neural networks (CNNs)
have gained much attention in relation to remote sensing and have achieved
remarkable performance for the extraction of buildings from high-resolution
aerial images. However, the issue of limited generalization to unseen images
remains. When there is a domain gap between the training and test datasets,
CNN-based segmentation models trained on the training dataset fail to segment
buildings in the test dataset. In this paper, we propose segmentation networks
based on a domain adaptive transfer attack (DATA) scheme for building
extraction from aerial images. The proposed system combines the domain transfer
and adversarial attack concepts. Based on the DATA scheme, the distribution of
the input images can be shifted to that of the target images while turning
images into adversarial examples against a target network. Defending against
adversarial examples adapted to the target domain can overcome the performance
degradation due to the domain gap and increase the robustness of the
segmentation model. Cross-dataset experiments and an ablation study are
conducted on three datasets: the Inria aerial image labeling
dataset, the Massachusetts building dataset, and the WHU East Asia dataset.
Compared to the performance of the segmentation network without the DATA
scheme, the proposed method shows improvements in the overall IoU. Moreover, it
is verified that the proposed method outperforms feature
adaptation (FA) and output space adaptation (OSA).
Comment: 11 pages, 12 figures
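For readers unfamiliar with the IoU metric reported in the abstract above, the following is a minimal sketch of intersection over union for binary building masks. The function name `binary_iou` and the convention of returning 1.0 when both masks are empty are assumptions for illustration, not details taken from the paper.

```python
import numpy as np

def binary_iou(pred_mask, true_mask):
    # Treat any non-zero pixel as "building".
    pred = np.asarray(pred_mask, dtype=bool)
    true = np.asarray(true_mask, dtype=bool)
    union = np.logical_or(pred, true).sum()
    if union == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0
    intersection = np.logical_and(pred, true).sum()
    return float(intersection / union)

# Toy 2x2 example: one shared building pixel out of three marked in total.
pred = [[1, 1],
        [0, 0]]
true = [[1, 0],
        [1, 0]]
iou = binary_iou(pred, true)
print(iou)
```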
Improved Chest Anomaly Localization without Pixel-level Annotation via Image Translation Network Application in Pseudo-paired Registration Domain
Image translation based on a generative adversarial network (GAN-IT) is a
promising method for the precise localization of abnormal regions in chest
X-ray images (AL-CXR) even without pixel-level annotation. However,
heterogeneous unpaired datasets hinder existing methods from extracting key
features and distinguishing normal from abnormal cases, resulting in inaccurate
and unstable AL-CXR. To address this problem, we propose an improved two-stage
GAN-IT involving registration and data augmentation. For the first stage, we
introduce an advanced deep-learning-based registration technique that virtually
and reasonably converts unpaired data into paired data for learning
registration maps, by sequentially utilizing linear-based global and uniform
coordinate transformation and AI-based non-linear coordinate fine-tuning. This
approach enables the independent and complex coordinate transformation of each
detailed location of the lung while recognizing the entire lung structure,
thereby achieving higher registration performance while resolving inherent
artifacts caused by unpaired conditions. For the second stage, we apply data
augmentation to diversify anomaly locations by swapping the left and right lung
regions on the uniform registered frames, further improving the performance by
alleviating the imbalance in the distribution of left and right lung lesions.
The proposed method is model agnostic and shows consistent AL-CXR performance
improvement in representative AI models. Therefore, we believe GAN-IT for
AL-CXR can be clinically implemented by using our basis framework, even when
training data are scarce or pixel-level disease annotation is difficult.
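The second-stage augmentation described in the abstract above can be sketched roughly as follows. This is only an illustration under a strong assumption: the registered frame is taken to be symmetric about its vertical midline, so that a horizontal mirror exchanges the left and right lung regions. The paper may implement the swap differently, and the helper name `swap_lung_sides` is hypothetical.

```python
import numpy as np

def swap_lung_sides(frame):
    # Mirror the registered frame about its vertical midline so that content
    # in the left lung region moves to the right lung region and vice versa.
    return np.fliplr(np.asarray(frame)).copy()

# Toy 4x4 "registered frame" with a bright spot in the leftmost column.
frame = np.zeros((4, 4))
frame[1, 0] = 1.0

augmented = swap_lung_sides(frame)
print(augmented)
```

Applying the swap to every registered frame would double the number of training samples and balance the sides on which bright regions appear, under the symmetry assumption stated above.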